Search: All records; Creators/Authors contains: "Zandi, A"

  1. As an integral part of qualitative research inquiry, field notes provide important data from researchers embedded in research sites. However, field notes can vary significantly, influenced by the researchers' immersion in the field, prior knowledge, beliefs, interests, and perspectives. As a consequence, their interpretation presents significant challenges. This study offers a preliminary investigation into the potential of using large language models to assist researchers with the analysis and interpretation of field-note data. Our methodology consisted of two phases. First, a researcher deductively coded field notes from six classroom implementations of a novel elementary-level mathematics curriculum. In the second phase, we prompted ChatGPT-4 to code the same field notes using the codebook, definitions, examples, and deductive coding approach employed by the researcher; we also prompted ChatGPT to justify its coding decisions. We then calculated agreements and disagreements between ChatGPT and the researcher, organized the data in a contingency table, computed Cohen's Kappa, and structured the data into a confusion matrix; finally, using the researcher's coding as the "gold standard," we calculated the performance measures Accuracy, Precision, Recall, and F1 score. Our findings revealed that while the researcher and ChatGPT generally agreed on how frequently the different codes were applied, overall agreement, as measured by Cohen's Kappa, was low. In contrast, measures from information science applied at the code level revealed more nuanced results. Moreover, coupled with ChatGPT's justifications of its coding decisions, these findings provided insights that can support the iterative improvement of codebooks.
    Free, publicly-accessible full text available April 8, 2026
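
The agreement analysis the abstract describes (contingency table, Cohen's Kappa, confusion matrix, then Accuracy, Precision, Recall, and F1 per code, with the researcher's coding as the gold standard) maps onto standard routines in, e.g., scikit-learn. Cohen's Kappa corrects the observed agreement p_o for chance agreement p_e via kappa = (p_o - p_e) / (1 - p_e). The sketch below is illustrative only, not the authors' implementation; the codebook labels and the two coders' label lists are hypothetical stand-ins.

```python
# Illustrative sketch (not the paper's code) of the two-coder agreement
# analysis described in the abstract. The label lists are hypothetical
# stand-ins for codes assigned to the same field-note segments.
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    cohen_kappa_score,
    confusion_matrix,
)

CODES = ["engagement", "discourse", "representation"]  # hypothetical codebook
researcher = ["engagement", "discourse", "discourse", "representation",
              "engagement", "representation", "discourse", "engagement"]
chatgpt = ["engagement", "engagement", "discourse", "representation",
           "discourse", "representation", "discourse", "discourse"]

# Chance-corrected overall agreement between the two coders.
print(f"Cohen's kappa: {cohen_kappa_score(researcher, chatgpt):.2f}")

# Contingency/confusion matrix, researcher's coding as the "gold standard".
print(confusion_matrix(researcher, chatgpt, labels=CODES))

# Overall accuracy plus per-code precision, recall, and F1.
print(f"Accuracy: {accuracy_score(researcher, chatgpt):.2f}")
print(classification_report(researcher, chatgpt, labels=CODES))
```

A setup like this also illustrates the split the abstract reports: two coders can apply each code with similar overall frequency (similar marginals) while disagreeing on many individual segments, which depresses Kappa even where per-code precision and recall remain informative.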